Statistical models of syntax learning and use
Authors
Abstract
This paper shows how to define probability distributions over linguistically realistic syntactic structures in a way that permits us to define language learning and language comprehension as statistical problems. We demonstrate our approach using lexical-functional grammar (LFG), but our approach generalizes to virtually any linguistic theory. Our probabilistic models are maximum entropy models. In this paper we concentrate on statistical inference procedures for learning the parameters that define these probability distributions. We point out some of the practical problems that make straightforward ways of estimating these distributions infeasible, and develop a “pseudo-likelihood” estimation procedure that overcomes some of these problems. This method raises interesting questions concerning the nature of the data available to a language learner and the modularity of language learning and processing. © 2002 Cognitive Science Society, Inc. All rights reserved.
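The pseudo-likelihood idea mentioned in the abstract can be sketched as a conditional maximum-entropy model: rather than normalizing over all possible syntactic structures (which is what makes straightforward estimation infeasible), the model normalizes only over the candidate parses of each observed sentence. The following is a minimal illustrative sketch, not the paper's implementation; the feature vectors, toy data, and learning rate are invented for illustration.

```python
import math

def log_conditional_likelihood(theta, data):
    """Sum of log P(gold parse | sentence) under a maxent model.

    data: list of (candidate_feature_vectors, index_of_gold_parse),
    where each candidate is a feature vector of the same length as theta.
    """
    total = 0.0
    for candidates, gold in data:
        scores = [sum(t * f for t, f in zip(theta, feats)) for feats in candidates]
        log_z = math.log(sum(math.exp(s) for s in scores))  # normalize over candidates only
        total += scores[gold] - log_z
    return total

def gradient(theta, data):
    """Gradient of the conditional log-likelihood: observed minus expected features."""
    grad = [0.0] * len(theta)
    for candidates, gold in data:
        scores = [sum(t * f for t, f in zip(theta, feats)) for feats in candidates]
        z = sum(math.exp(s) for s in scores)
        probs = [math.exp(s) / z for s in scores]
        for j in range(len(theta)):
            grad[j] += candidates[gold][j]
            grad[j] -= sum(p * feats[j] for p, feats in zip(probs, candidates))
    return grad

# Toy data: two sentences, each with two candidate parses and two features.
data = [([[1.0, 0.0], [0.0, 1.0]], 0),
        ([[1.0, 1.0], [0.0, 0.0]], 0)]

theta = [0.0, 0.0]
for _ in range(100):            # plain gradient ascent, step size 0.1
    g = gradient(theta, data)
    theta = [t + 0.1 * gj for t, gj in zip(theta, g)]

print(log_conditional_likelihood(theta, data))  # higher (closer to 0) than at theta = 0
```

The design point the abstract hints at is that this objective only requires, for each training sentence, the gold analysis plus a set of competing candidates, which is far cheaper than summing over the full space of structures the grammar licenses.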
Similar resources
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in the English language are hyphenated, with the hyphen separating the parts. The Persian language contains multi-part words as well. According to Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead of the half-space. This common misuse of the space character leads to some s...
Integrating Topics and Syntax
Statistical approaches to language learning typically focus on either short-range syntactic dependencies or long-range semantic dependencies between words. We present a generative model that uses both kinds of dependencies, and can be used to simultaneously find syntactic classes and semantic topics despite having no representation of syntax or semantics beyond statistical dependency. This mode...
Learning to Translate with Source and Target Syntax
Statistical translation models that try to capture the recursive structure of language have been widely adopted over the last few years. These models make use of varying amounts of information from linguistic theory: some use none at all, some use information about the grammar of the target language, some use information about the grammar of the source language. But progress has been slower on ...
Models of Sequential Learning
The expression of music, language and physical motor skills share the need to execute well-learned plans of sequential behavior. We can think of each of these as governed by a set of syntactic principles that instantiate organizational rules. From a computational perspective, it has been frequently observed that a fair amount of apparently rule-driven behavior can be captured by simple statisti...
Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains
In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and the determination of their quality are very important. Various image processing algorithms have been applied in recent years to detect different agricultural products. The problem of rice classification and quality detection in this paper is presented based on model learning concepts includ...
Learning Syntax-Semantics Mappings to Bootstrap Word Learning
This paper addresses possible interactive effects between word learning and syntax learning at an early stage of development. We present a computational model that simulates how the results from a syntax learning process and a word learning process can be integrated to build syntax-semantics mappings, and how the emergence of links between syntax and word learning could facilitate subsequent wo...
Journal: Cognitive Science
Volume: 26, Issue: -
Pages: -
Publication date: 2002